Methods and Systems for Network Element Fault Information Processing

ABSTRACT

Embodiments of the present invention relate to systems and methods for network element fault information processing. In an embodiment, a network element identifier and a network element fault information processing instruction are received. A query for network element fault information based at least in part on the network element identifier is sent. Network element fault information is received. The network element fault information is processed based at least in part on the received network element fault information processing instruction. The processed network element fault information is output.

BACKGROUND OF THE INVENTION

1. Field of the Invention

Embodiments of the present invention relate to network communications systems. More particularly, embodiments of the present invention relate to systems and methods for network element fault information processing.

2. Background Information

A known telecommunications management system is the NavisCore™ Element Management System from Lucent Technologies of Murray Hill, N.J. NavisCore is a centralized service and network management application. It can provide standards-based management and control of telecommunications network elements of networks such as frame relay, Switched Multimegabit Data Service (“SMDS”), Asynchronous Transfer Mode (“ATM”), and Internet Protocol (“IP”) switch networks. NavisCore includes a distributed and multiservice element manager and is a graphically integrated UNIX-based platform that provides a network management solution based on Telecommunications Management Network (“TMN”) standards.

NavisCore can speed circuit provisioning with point-and-click operations to establish end-to-end network connectivity and provides a variety of traps for alarm indications and statistics logic for elements in the switch network such as switches, trunks, physical ports, logical ports, permanent virtual circuits (“PVCs”), switched virtual circuits, and so on. A NavisCore user can use network statistics for real-time status information on logical and physical ports and view key usage data on such interfaces for network planning and trend analysis. Network element faults (i.e., network faults) can be reported to a central repository where a NavisCore operator can access the network element fault information.

A central repository receiving the reported network element faults is typically one or more NavisCore servers. The NavisCore servers record the reported network element faults in files typically called trap logs. Examples of trap log information are as follows:

985885337 1 Thu Mar 29 12:02:17 2001 NWORLAMABB1- Switch nworlamabb1 interface up (SNMP linkUp trap) on LPort 60QGDA500180_LMC(14,7);1.1.3.6.1.4.1.277.10.0.3 0;

985885337 7 Thu Mar 29 12:02:17 2001 NWORLAMABB1- LPort 60QGDA500180_LMC(14,7) at switch nworlamabb1 is up with Customer Name SUPERS_SUPERMARKET.;3 .1.3.6.1.4.1.277.10.0.30 0; and

985885533 7 Thu Mar 29 12:05:33 2001 NWORLAMABB1- LPort 60QGDA500180_LMC(14,7) in switch nworlamabb1 is up, following PVCs is also up: HOUM_SUPERS_SUPERMARKE_100_99 NWOR_SUPERS_SUPERMARKET_100_101 NWOR_SUPERS_SUPERMARKET_100_103 NWOR_SUPERS_SUPERMARKET_100_104 NWOR_SUPERS_SUPERMARKET_100_105 NWOR_ROUSES_SUPERMARKET_100_106 NOWR_SUPERS_SUPERMARKET_100_107 NWOR_SUPERS_SUPERMARKET_100_108 NWOR_SUPERS_SUPERMARKET_100_109 NWOR_SUPERS_SUPERMARKET_110_110 NWOR_SUPERS_MRKT_NNI_BTR_202_758 HOUM_SUPERS_SUPERMARKET_203_203 .;1 .1.3.6.1.4.1.277.10.0.10009.

Trap logs and other network element fault information can indicate network fault events but are cumbersome to search and display. For example, if a customer of a network services provider complains about service problems (e.g., loss of service, degrading of service levels, and so on), a network technician can retrieve and sequentially review the trap logs to determine information about the customer's network services. In view of the foregoing, it can be appreciated that a substantial need exists for systems and methods that can advantageously provide for network element fault information processing.

BRIEF SUMMARY OF THE INVENTION

Embodiments of the present invention relate to systems and methods for network element fault information processing. In an embodiment, a network element identifier and a network element fault information processing instruction are received. A query for network element fault information based at least in part on the network element identifier is sent. Network element fault information is received. The network element fault information is processed based at least in part on the received network element fault information processing instruction. The processed network element fault information is output.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an embodiment of the present invention.

FIG. 2 illustrates a system in accordance with an embodiment of the present invention.

FIG. 3 illustrates a method in accordance with an embodiment of the present invention.

FIG. 4 shows an example of processed network element fault information in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

According to an embodiment of the present invention, a system receives an instruction (e.g., from a user) to retrieve network element fault information from a server that stores, for example, network fault information corresponding to a network element. For example, the server can be coupled to one or more network elements such as a network switch, a network circuit, a network path, and so on. When a network fault related to a network element is detected, information corresponding to the network fault can be sent to the server for storage in a trap log. The network element fault information can include, for example, an identifier of the network element, a fault type identifier corresponding to the type of network fault, a time and/or date identifier associated with the time and/or date of the network fault, and so on.

The instruction to retrieve network element fault information corresponding to a network element can include a network element identifier corresponding to the network element. For example, a user can input a network element identifier, and the network element identifier can be a permanent virtual circuit (“PVC”) identifier, a logical port (“LP”) identifier, and so on. One or more trap logs corresponding to the network element can be queried for and received. The instruction to retrieve the trap log information can also include a requested fault type identifier that specifies which type of fault information should be output (e.g., displayed, printed, etc.) to the user. For example, a user may be interested in displaying network fault information associated with network element transitions to an up state. The trap log can be analyzed (e.g., searched, parsed, etc.) to determine network fault information associated with the requested fault type identifier, and the determined network fault information can be presented to the user.

FIG. 1 is a schematic diagram of an embodiment of the present invention. Network 101 includes interior switches 110, 111, 112, and 113 (e.g., core switches, etc.), which are coupled together. As used to describe embodiments of the present invention, the term “coupled” encompasses a direct connection, an indirect connection, or a combination thereof. Moreover, two devices that are coupled can engage in direct communications, in indirect communications, or a combination thereof. Network 101 can be a data network such as a frame relay network, an SMDS network, an ATM network, an IP network, a Multiprotocol Label Switching (“MPLS”) network, an X.25 network, and so on. Accordingly, switches 110-113 can be frame relay switches, SMDS switches, ATM switches, IP switches, MPLS switches, X.25 switches, and so on.

Network 101 can also include switches such as an edge switch 120, which is coupled to one or more interior switches such as, for example, interior switches 111 and 113. An edge switch 120 is typically the first point of user (e.g., customer) access to a network and the final point of exit for communications to the user. Edge switches are typically inter-coupled by interior switches such as switches 110-113. Edge switch 120 is coupled to customer premises equipment (“CPE”) 165 of a customer location 160 via a communications link 166. Examples of customer premises equipment 165 include a switch, a router, a network interface device, a modem, a cable modem, a digital subscriber line (“DSL”) modem, and so on. Examples of communications link 166 include a 56K line, a 64K line, a T1 connection, a T3 connection, and so on. Further examples of communications link 166 include a PVC, a permanent virtual path (“PVP”), and so on. In an embodiment, communications link 166 is associated with a logical port.

FIG. 2 illustrates a system in accordance with an embodiment of the present invention. Network faults can occur with respect to communications between edge switch 120 and CPE 165 via communications link 166. For example, communications to and/or from edge switch 120 can bounce, the communications link 166 can go down and subsequently up, CPE 165 may not function properly, and so on. In an embodiment in which network 101 of FIG. 1 is a frame relay network, frame errors can occur. In an embodiment in which network 101 is, for example, an ATM network or a frame relay network, communications link 166 can be a PVC or PVP that goes down periodically. In a further embodiment, communications link 166 can include one or more communication sub-links 167 (e.g., cables, wires, optical fibers, circuits, etc.) and/or associated equipment such as amplifiers, multiplexers (“MUXs”), and so on. A network fault can be a faulty amplifier, a failed MUX, a short circuit, and so on. Based in part on the description of embodiments of the present invention provided herein, other examples of network faults will be apparent to one of skill in the art.

Edge switch 120 can determine network faults corresponding to edge switch 120, communications link 166, and/or CPE 165 and report the network faults to server 150. For example, edge switch 120 can send a network fault report (e.g., a network fault message) to server 150 after determining one or more network faults. Edge switch 120 can include parameters that can determine at least in part when network fault conditions are to be reported to server 150. For example, edge switch 120 can include parameters that establish that certain types of network faults (e.g., frame errors, down errors, etc.) are reported to server 150. Edge switch 120 can also include parameters that establish a threshold severity or persistence of a network fault that will result in reporting the network fault to server 150. For example, a down error can be reported to server 150 when it lasts more than five (5) cycles. Alternatively, individual frame errors may not be reported to server 150, but after a threshold value related to frame errors is met or exceeded, a frame error network fault report can be sent to server 150. For example, a certain number of frame errors in a period of time can trigger sending of a frame error network fault report to server 150.
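The two reporting thresholds described above can be illustrated with a minimal Python sketch. It is not taken from any NavisCore or switch software; the class name `FaultReportPolicy`, its parameters, and the default frame error limit and window are hypothetical values chosen for illustration. Only the "more than five cycles" persistence rule comes from the text.

```python
from collections import deque


class FaultReportPolicy:
    """Hypothetical reporting policy for an edge switch (illustrative only)."""

    def __init__(self, down_cycles_threshold=5, frame_error_limit=10,
                 window_seconds=60.0):
        # A down error is reported once it lasts MORE than this many cycles.
        self.down_cycles_threshold = down_cycles_threshold
        # Assumed values: this many frame errors within the window trigger a report.
        self.frame_error_limit = frame_error_limit
        self.window_seconds = window_seconds
        self._down_cycles = 0
        self._down_reported = False
        self._frame_errors = deque()  # timestamps of recent frame errors

    def observe_cycle(self, link_is_down):
        """Return True when a down error should be reported (once per outage)."""
        if not link_is_down:
            self._down_cycles = 0
            self._down_reported = False
            return False
        self._down_cycles += 1
        if (self._down_cycles > self.down_cycles_threshold
                and not self._down_reported):
            self._down_reported = True
            return True
        return False

    def observe_frame_error(self, now):
        """Return True when the frame error count in the window meets the limit."""
        self._frame_errors.append(now)
        # Drop frame errors that fell out of the sliding window.
        while self._frame_errors and now - self._frame_errors[0] > self.window_seconds:
            self._frame_errors.popleft()
        if len(self._frame_errors) >= self.frame_error_limit:
            self._frame_errors.clear()  # start counting afresh after reporting
            return True
        return False
```

In this sketch individual frame errors are never reported on their own; only the aggregated condition produces a report, which matches the alternative described in the paragraph.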

Server 150 can store network fault reports received from edge switch 120. For example, server 150 can store network fault reports received from edge switch 120 in a directory where each network fault report identifies a type of network fault, a date and/or time associated with the network fault, a network element associated with the network fault, and so on. In another embodiment, server 150 can store network fault reports received from edge switch 120 in a network fault report file where corresponding network fault reports are stored so that the network fault report file includes a history of network fault reports. In an embodiment, the network fault report file includes network fault reports from a long-term period (e.g., days, weeks, months, years, etc.). In another embodiment, the network fault report file is a buffer file that includes network fault reports for a rolling period of time (e.g., the previous five hours, the previous four days, the previous week, the previous 30 days, and so on).
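The rolling buffer embodiment can be sketched as follows. This is an illustrative Python example only: the class name `RollingFaultLog` and its methods are hypothetical, and the sketch assumes reports arrive in non-decreasing timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingFaultLog:
    """Hypothetical in-memory stand-in for a rolling network fault report file."""

    def __init__(self, retention):
        self.retention = retention          # e.g. timedelta(hours=5)
        self._reports = deque()             # (timestamp, report), oldest first

    def append(self, timestamp, report):
        """Store a report, then discard reports older than the retention period."""
        self._reports.append((timestamp, report))
        cutoff = timestamp - self.retention
        while self._reports and self._reports[0][0] < cutoff:
            self._reports.popleft()

    def reports(self):
        """Return the retained reports, oldest first."""
        return [report for _, report in self._reports]


log = RollingFaultLog(retention=timedelta(hours=5))
start = datetime(2001, 3, 29, 12, 0, 0)
log.append(start, "down")
log.append(start + timedelta(hours=3), "up")
log.append(start + timedelta(hours=6), "down")   # the first report ages out here
```

A long-term history file, by contrast, would simply append without the expiry step.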

In an embodiment of the present invention, server 150 can be coupled to edge switch 120 via interior switch 113 and communication paths 221 and 222. In another embodiment, server 150 can be coupled to edge switch 120 via a communications path 226. Server 150, in an embodiment, can be a NavisCore server that receives network management information from edge switch 120. According to an embodiment, edge switch 120 includes a trap, which is a mechanism permitting a device to send (e.g., automatically, autonomously, periodically, etc.) an alarm for certain network events to a management station. Typically, network management information is acquired by polling network nodes on a periodic (e.g., regular) basis, but such a network management information acquisition strategy can be modified when a trap is set from a network device. With traps, a network element (e.g., a network node) alerts the management station to an event (e.g., a network fault, a routine event, a catastrophic event, etc.). The management station, in an embodiment, can initiate a polling sequence to network elements (e.g., nodes) to determine potential causes of the event. Such a trap and poll strategy is often called trap-directed polling.
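Trap-directed polling can be illustrated with a short Python sketch. It does not use SNMP or any NavisCore interface; the `ManagementStation` class, the `on_trap` method, and the node names are hypothetical, and each poller is modeled as a plain callable that returns a status string.

```python
from typing import Callable, Dict, List


class ManagementStation:
    """Hypothetical management station that polls only after receiving a trap."""

    def __init__(self, pollers: Dict[str, Callable[[], str]]):
        self.pollers = pollers          # node name -> function returning its status
        self.trap_log: List[str] = []   # received traps, in arrival order

    def on_trap(self, source: str, event: str, related: List[str]) -> Dict[str, str]:
        """Record the trap, then poll the source and related nodes for causes."""
        self.trap_log.append(f"{source}: {event}")
        targets = [source] + [name for name in related if name != source]
        # Nodes without a registered poller are skipped rather than raising.
        return {name: self.pollers[name]() for name in targets if name in self.pollers}


station = ManagementStation({
    "edge120": lambda: "port down",
    "core113": lambda: "ok",
})
statuses = station.on_trap("edge120", "link down", ["core113", "unknown"])
```

The point of the design is that the trap narrows the polling sequence to the elements likely to explain the event, instead of waiting for the next periodic poll of every node.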

Network management personnel often become aware of network faults when a customer (e.g., an owner or operator of CPE 165) complains that its communications are degraded or inoperative. To diagnose why the customer communications are degraded or inoperative, network management personnel often examine network fault reports stored on server 150. For example, network management personnel can examine network fault information stored on server 150 to ascertain whether there are any network faults associated with communications link 166, edge switch 120, and/or CPE 165.

Known methods of analyzing the network fault information relate to examining the native (e.g., unprocessed) network fault reports stored on server 150. For example, a network management technician can examine individual network fault reports (e.g., trap logs stored on a server) such as the following:

985885337 1 Thu Mar 29 12:02:17 2001 NWORLAMABB1- Switch nworlamabb1 interface up (SNMP linkUp trap) on LPort 60QGDA500180_LMC(14,7);1.1.3.6.1.4.1.277.10.0.10003 0;

985885337 7 Thu Mar 29 12:02:17 2001 NWORLAMABB1- LPort 60QGDA500180_LMC(14,7) at switch nworlamabb1 is up with Customer Name SUPERS_SUPERMARKET.;3 .1.3.6.1.4.1.277.10.0.30 0; and

985885533 7 Thu Mar 29 12:05:33 2001 NWORLAMABB1- LPort 60QGDA500180_LMC(14,7) in switch nworlamabb1 is up, following PVCs is also up: HOUM_SUPERS_SUPERMARKE_100_99 NWOR_SUPERS_SUPERMARKET_100_101 NWOR_SUPERS_SUPERMARKET_100_103 NWOR_SUPERS_SUPERMARKET_100_104 NWOR_SUPERS_SUPERMARKET_100_105 NWOR_ROUSES_SUPERMARKET_100_106 NOWR_SUPERS_SUPERMARKET_100_107 NWOR_SUPERS_SUPERMARKET_100_108 NWOR_SUPERS_SUPERMARKET_100_109 NWOR_SUPERS_SUPERMARKET_110_110 NWOR_SUPERS_MRKT_NNI_BTR_202_758 HOUM_SUPERS_SUPERMARKET_203_203.; 1.1.3.6.1.4.1.277.10.0.10009.

Native network fault information, such as the examples above, is difficult to review and analyze. The native network fault information provides required information, but also can include too much information when attempting to troubleshoot a fault or problem in a communications network. In an embodiment of the present invention, network fault information is identified, retrieved, and processed to aid in network management analysis and operations. For example, a network element can be identified, a network fault type can be specified, network element fault information associated with the network element can be retrieved, and the network fault information can be processed based in part on the specified network fault type.
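As a first processing step, a native trap log entry can be split into fields. The Python sketch below is a guess at the entry layout inferred only from the examples above (epoch seconds, a numeric code, a human-readable timestamp, a switch name followed by a hyphen, a message, and a trailer after the first semicolon); it is not a documented NavisCore format, and the names `TrapEntry` and `parse_trap_line` are hypothetical. The trailer is kept as an opaque string.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed layout: <epoch> <code> <Day Mon DD HH:MM:SS YYYY> <SWITCH>- <message>;<trailer>
_TRAP_RE = re.compile(
    r"^(?P<epoch>\d+)\s+(?P<code>\d+)\s+"
    r"(?P<stamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(?P<switch>\S+?)-\s+(?P<message>.*?);(?P<trailer>.*)$"
)


@dataclass
class TrapEntry:
    epoch: int
    code: int
    timestamp: datetime
    switch: str
    message: str
    trailer: str   # left unparsed; its structure is not described in the text


def parse_trap_line(line: str) -> Optional[TrapEntry]:
    """Parse one trap log entry, or return None if it does not fit the assumed layout."""
    match = _TRAP_RE.match(line.strip())
    if match is None:
        return None
    stamp = " ".join(match["stamp"].split())  # normalize repeated spaces
    return TrapEntry(
        epoch=int(match["epoch"]),
        code=int(match["code"]),
        timestamp=datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"),
        switch=match["switch"],
        message=match["message"].strip(),
        trailer=match["trailer"].strip(),
    )


entry = parse_trap_line(
    "985885337 1 Thu Mar 29 12:02:17 2001 NWORLAMABB1- Switch nworlamabb1 "
    "interface up (SNMP linkUp trap) on LPort 60QGDA500180_LMC(14,7);"
    "1.1.3.6.1.4.1.277.10.0.10003 0;"
)
```

Returning `None` for unrecognized lines lets a caller skip entries that do not match instead of aborting the whole log.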

FIG. 3 illustrates a method in accordance with an embodiment of the present invention. A user can be prompted to enter a network element identifier (step 305). For example, a network element identifier can be a switch identifier, a circuit identifier, a communications link identifier, a PVC identifier, a logical port identifier, a combination thereof, and so on. In an embodiment, a user can be prompted to enter one or more network element identifiers such as a switch identifier and a circuit identifier. In another embodiment, a user can be prompted to enter a trap log identifier, a NavisCore server identifier, a combination thereof, and so on. In an embodiment, the user is prompted to enter a network element identifier by a computer program or a software script executing on a computer.

For example, referring again to FIG. 2, a computer 290 can be coupled to server 150. In an embodiment, computer 290 includes a processor and a memory. The processor can be, for example, an Intel Pentium® 4 processor, manufactured by Intel Corp. of Santa Clara, Calif. As another example, the processor can be an Application Specific Integrated Circuit (ASIC). The memory may be a random access memory (RAM), a dynamic RAM (DRAM), a static RAM (SRAM), a volatile memory, a non-volatile memory, a flash RAM, polymer ferroelectric RAM, Ovonics Unified Memory, magnetic RAM, a cache memory, a hard disk drive, a magnetic storage device, an optical storage device, a magneto-optical storage device, or a combination thereof. The memory of computer 290 can store a plurality of instructions adapted to be executed by the processor of computer 290.

In an embodiment, computer 290 is coupled to server 150 via a network 295 and a network connection (e.g., data port, input/output port, etc.). Examples of a network 295 include a Wide Area Network (WAN), a Local Area Network (LAN), the Internet, a wireless network, a wired network, a connection-oriented network, a packet network, an Internet Protocol (IP) network, or a combination thereof.

Referring again to FIG. 3, a network element identifier can be received (step 310). The user can be prompted to enter a network element fault information processing instruction (step 315). For example, a network element fault information processing instruction can be an instruction to display transitions to a down state, an instruction to display transitions to an up state, an instruction to display a log of transitions for a period of time (e.g., the previous day, the previous four days, the previous week, etc.), an instruction to display network element fault information in a simplified format, an instruction to display network element fault information in the native format, a combination thereof, and so on. The network element fault information processing instruction can be received (step 320). Based at least in part on the received network element identifier, network element fault information corresponding to the network element identified by the network element identifier can be received. For example, the network element fault information can be queried for from a NavisCore server associated with the identified network element. In another embodiment of the present invention, the network element fault information is queried from the network element. In a further embodiment of the present invention, the identified network element receives the query for the network element fault information and sends the query to a network management station (e.g., a server) that receives network fault information from the network element.

After the network element fault information is received, it can be processed based at least in part on the received network element fault information processing instruction (step 330). For example, when the received network element fault information processing instruction is an instruction to display transitions to an up state, the network element fault information can be processed to isolate and/or summarize the network fault information corresponding to transitions to an up state. As another example, when the received network element fault information processing instruction is an instruction to display transitions to a down state, the network element fault information can be processed to isolate and/or summarize the network fault information corresponding to transitions to a down state. In another example, when the received network element fault information processing instruction is an instruction to display a log of transitions for a period of time, the network element fault information can be processed to isolate and/or summarize the network element fault information corresponding to transitions to an up state and transitions to a down state. After the received network element fault information has been processed based at least in part on the received network element fault information processing instruction, the processed network element fault information can be output (step 335). For example, the processed network element fault information can be output to a printer, to a display device (e.g., a Cathode Ray Tube (“CRT”) display, a Liquid Crystal Display (“LCD”), a video display, a text display, a dumb terminal, etc.), to a personal digital assistant (“PDA”), to a combination thereof, and so on.
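The processing of step 330 can be sketched in Python as a filter selected by the processing instruction. The instruction names (`"up"`, `"down"`, `"transitions"`, `"native"`) and the functions `classify` and `process` are hypothetical. The wording matched for down-state messages is an assumption that mirrors the up-state examples quoted earlier, since the text shows no down-state entries.

```python
from typing import Iterable, List

# Hypothetical instruction names; the text describes the instructions only in prose.
INSTRUCTIONS = ("up", "down", "transitions", "native")


def classify(message: str) -> str:
    """Label a trap message as 'up', 'down', or 'other'.

    Assumption: down-state messages mirror the up-state wording
    ("is up", "interface up") seen in the example trap log entries.
    """
    text = message.lower()
    if " is down" in text or "interface down" in text:
        return "down"
    if " is up" in text or "interface up" in text:
        return "up"
    return "other"


def process(messages: Iterable[str], instruction: str) -> List[str]:
    """Isolate the messages selected by the processing instruction, in order."""
    if instruction not in INSTRUCTIONS:
        raise ValueError(f"unknown processing instruction: {instruction!r}")
    messages = list(messages)
    if instruction == "native":
        return messages                      # native format: nothing filtered
    wanted = {"up", "down"} if instruction == "transitions" else {instruction}
    return [m for m in messages if classify(m) in wanted]


sample = [
    "Switch nworlamabb1 interface up (SNMP linkUp trap) on LPort 60QGDA500180_LMC(14,7)",
    "LPort 60QGDA500180_LMC(14,7) at switch nworlamabb1 is down",   # assumed wording
    "LPort 60QGDA500180_LMC(14,7) frame error threshold exceeded",  # assumed wording
]
up_only = process(sample, "up")
```

Restricting a log of transitions to a period of time would add a timestamp comparison to the same filter; it is omitted here to keep the sketch short.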

FIG. 4 shows an example of processed network element fault information in accordance with an embodiment of the present invention. A user can specify a network element identifier and a network element fault information processing instruction. Network element fault information corresponding to the network element identifier can be queried for and received, and then processed based at least in part on the network element fault information processing instruction. For example, a user can specify that information about specific types of network element faults for a network element be identified and summarized.

In an embodiment of the present invention, a data record 400 is generated based on a specified processing of network element fault information. Data record 400 can correspond to a network element such as a network switch, a network circuit, a network logical port, a combination thereof, and so on. Data record 400 can include one or more entries 410, and each entry of at least a subset of the one or more entries 410 can include one or more chronological data fields to store chronological data. For example, data record 400 can include an entry 410 including a month identifier field 411 to store a month identifier and a date identifier field 412 to store a date identifier. Entry 410 of data record 400 can also include one or more network fault indicator fields. For example, entry 410 can include a down state field 413 to store a down state value (e.g., a number of times a network element was in a down state, a boolean value indicating whether the network element was in a down state, etc.), an up state field 414 to store an up state value, a frame error state field 415 to store a frame error state value, and a remainder network fault state field 416 to store a remainder network fault state value.

For example, data record 400 is generated based at least in part on native network fault information received from a network management station (e.g., trap log information from a NavisCore server). The native network fault information is processed to identify and/or summarize network faults. Data record 400 indicates, for example, that the network element corresponding to data record 400 reported the following number of down network faults for March 26 through March 30 respectively: 0, 18, 122, 106, and 94. The network element also reported the following number of up network faults for March 26 through March 30 respectively: 0, 20, 86, 132, and 144. The network element did not report any frame errors or any remainder network faults for March 26 through March 30. According to alternative embodiments of the present invention, a data record can be generated that includes processed network element fault information corresponding to one or more network faults over varying periods of time (e.g., day by day, hour by hour, week by week, minute by minute, and so on).
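A day-by-day summary of the kind held in data record 400 can be sketched in Python as a per-date tally. The function names `build_record` and `format_record`, the field keys, and the column layout are hypothetical; the four counters correspond to the down state, up state, frame error state, and remainder fields described above.

```python
from datetime import date
from typing import Dict, Iterable, List, Tuple

# One counter per fault indicator field (fields 413 through 416 in the description).
FIELDS = ("down", "up", "frame_error", "other")


def build_record(events: Iterable[Tuple[date, str]]) -> Dict[date, Dict[str, int]]:
    """Count faults per day; kinds outside FIELDS fall into the remainder counter."""
    record: Dict[date, Dict[str, int]] = {}
    for day, kind in events:
        row = record.setdefault(day, dict.fromkeys(FIELDS, 0))
        row[kind if kind in FIELDS else "other"] += 1
    return dict(sorted(record.items()))   # chronological order


def format_record(record: Dict[date, Dict[str, int]]) -> List[str]:
    """Render one text line per entry: month, day, then the four counters."""
    lines = ["Mon Day  Down    Up Frame Other"]
    for day, row in record.items():
        lines.append(
            f"{day.strftime('%b')} {day.day:>3} {row['down']:>5} {row['up']:>5} "
            f"{row['frame_error']:>5} {row['other']:>5}"
        )
    return lines


events = [
    (date(2001, 3, 28), "frame_error"),
    (date(2001, 3, 27), "down"),
    (date(2001, 3, 27), "up"),
    (date(2001, 3, 27), "down"),
    (date(2001, 3, 28), "unrecognized"),
]
record = build_record(events)
```

Grouping by hour or week instead of by day would only change the key used for each row.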

Embodiments of the present invention relate to data communications via one or more networks. The data communications can be carried by one or more communications channels of the one or more networks. A network can include wired communication links (e.g., coaxial cable, copper wires, optical fibers, a combination thereof, and so on), wireless communication links (e.g., satellite communication links, terrestrial wireless communication links, satellite-to-terrestrial communication links, a combination thereof, and so on), or a combination thereof. A communications link can include one or more communications channels, where a communications channel carries communications. For example, a communications link can include multiplexed communications channels, such as time division multiplexing (“TDM”) channels, frequency division multiplexing (“FDM”) channels, code division multiplexing (“CDM”) channels, wavelength division multiplexing (“WDM”) channels, a combination thereof, and so on.

In accordance with an embodiment of the present invention, instructions adapted to be executed by a processor to perform a method are stored on a computer-readable medium. The computer-readable medium can be a device that stores digital information. For example, a computer-readable medium includes a compact disc read-only memory (CD-ROM) as is known in the art for storing software. The computer-readable medium is accessed by a processor suitable for executing instructions adapted to be executed. The terms “instructions adapted to be executed” and “instructions to be executed” are meant to encompass any instructions that are ready to be executed in their present form (e.g., machine code) by a processor, or require further manipulation (e.g., compilation, decryption, or provided with an access code, etc.) to be ready to be executed by a processor.

Systems and methods in accordance with an embodiment of the present invention disclosed herein can advantageously process (e.g., analyze and/or summarize) network element fault information. A user can identify a network element by inputting one or more network element identifiers. The user can also specify a network element fault information processing instruction. The network element fault information can be processed based at least in part on the specified network element fault information processing instruction.

Embodiments of systems and methods for network element fault information processing have been described. In the foregoing description, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the present invention. It will be appreciated, however, by one skilled in the art that the present invention may be practiced without these specific details. In other instances, structures and devices are shown in block diagram form. Furthermore, one skilled in the art can readily appreciate that the specific sequences in which methods are presented and performed are illustrative and it is contemplated that the sequences can be varied and still remain within the spirit and scope of the present invention.

In the foregoing detailed description, systems and methods in accordance with embodiments of the present invention have been described with reference to specific exemplary embodiments. Accordingly, the present specification and figures are to be regarded as illustrative rather than restrictive.

1. A system for network element fault information processing, the system comprising: a network element; a first communications link coupled to the network element, the first communications link to carry communications to and from a customer; and a computer, the computer coupled to the network element, the computer including a processor and a memory, the memory storing a plurality of instructions to be executed by the processor, the plurality of instructions including instructions to receive a network element identifier, the network element identifier corresponding to the network element; receive a network element fault information processing instruction; receive network element fault information; and process the network element fault information based at least in part on the received network element fault information processing instruction. 2-43. (canceled)